Ted Laderas
2019-04-08
| VariableName | Definition |
|---|---|
| Age | Age in years at screening of study participant. Note: Subjects 80 years or older were recorded as 80. |
| AlcoholDay | Average number of drinks consumed on days that participant drank alcoholic beverages. Reported for participants aged 18 years or older. |
| BMI | Body mass index (weight/height² in kg/m²). Reported for participants aged 2 years or older. |
| BMI_WHO | Body mass index category. Reported for participants aged 2 years or older. One of 12.0_18.4, 18.5_24.9, 25.0_29.9, or 30.0_plus. |
| BPSysAve | Combined systolic blood pressure reading, following the procedure outlined for BPXSAR. |
| Depressed | Self-reported number of days where participant felt down, depressed or hopeless. Reported for participants aged 18 years or older. One of None, Several, Majority (more than half the days), or AlmostAll. |
| Education | Educational level of study participant. Reported for participants aged 20 years or older. One of 8thGrade, 9-11thGrade, HighSchool, SomeCollege, or CollegeGrad. |
| Gender | Gender (sex) of study participant, coded as male or female. |
| HardDrugs | Participant has tried cocaine, crack cocaine, heroin or methamphetamine. Reported for participants aged 18 to 69 years as Yes or No. |
| HHIncome | Total annual gross income for the household in US dollars. One of 0 - 4999, 5000 - 9999, 10000 - 14999, 15000 - 19999, 20000 - 24999, 25000 - 34999, 35000 - 44999, 45000 - 54999, 55000 - 64999, 65000 - 74999, 75000 - 99999, or 100000 or More. |
| LittleInterest | Self-reported number of days where participant had little interest in doing things. Reported for participants aged 18 years or older. One of None, Several, Majority (more than half the days), or AlmostAll. |
| Marijuana | Participant has tried marijuana. Reported for participants aged 18 to 59 years as Yes or No. |
| MaritalStatus | Marital status of study participant. Reported for participants aged 20 years or older. One of Married, Widowed, Divorced, Separated, NeverMarried, or LivePartner (living with partner). |
| Race1 | Reported race of study participant: Mexican, Hispanic, White, Black, or Other. |
| Race3 | Reported race of study participant, including non-Hispanic Asian category: Mexican, Hispanic, White, Black, Asian, or Other. Not available for 2009-10. |
| RegularMarij | Participant has been/is a regular marijuana user (used at least once a month for a year). Reported for participants aged 18 to 59 years as Yes or No. |
| AgeRegMarij | Age of participant when first started regularly using marijuana. Reported for participants aged 18 to 59 years. |
| SleepHrsNight | Self-reported number of hours study participant usually gets at night on weekdays or workdays. Reported for participants aged 16 years and older. |
| SleepTrouble | Participant has told a doctor or other health professional that they had trouble sleeping. Reported for participants aged 16 years and older. Coded as Yes or No. |
| SurveyYr | Which survey the participant participated in. |
| TotChol | Total cholesterol in mmol/L. Reported for participants aged 6 years or older. |
| TVHrsDay | Number of hours per day on average participant watched TV over the past 30 days. Reported for participants aged 2 years or older. One of 0_to_1hr, 1_hr, 2_hr, 3_hr, 4_hr, or More_4_hr. Not available for 2009-10. |
We can understand an outcome and look at its association with measured variables in the data.
We’ll look at Depression today, but Physical Activity and Diabetes Status are also available as outcomes.
Pair off or get into groups of 3.
Given the list of variables, come up with one question about your outcome you’re curious about.
What do you expect is the case?
See if you can answer it!
“Exploratory data analysis can never be the whole story, but nothing else can serve as the foundation stone.” - John Tukey, Exploratory Data Analysis
We’ll start exploring the data immediately!
Go to the app:
Data Explorer
[…] Depressed variable? (in R, we call them factors)
[…] Depressed?
[…] Depressed variable defined in this dataset?
Do people with the most days of LittleInterest also have the most days of Depression?
[…] Depressed?
There are many reasons why data can be missing from a variable!
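The questions about categories and missing values can be sketched in base R. This is only an illustration: the `toy` data frame and its values are invented, and the factor levels are assumed from the variable table above rather than read from the real dataset.

```r
# Hypothetical toy data standing in for the real dataset (values are invented)
toy <- data.frame(
  Depressed = factor(
    c("None", "Several", NA, "None", "Majority", NA),
    levels = c("None", "Several", "Majority", "AlmostAll")
  ),
  SleepHrsNight = c(7, 6, 8, NA, 5, 7)
)

# Categories (levels) of a factor
depressed_levels <- levels(toy$Depressed)

# Count of participants per category, with missing values shown as their own row
depressed_counts <- table(toy$Depressed, useNA = "ifany")

# Number and proportion of missing values in each variable
n_missing <- colSums(is.na(toy))
prop_missing <- colMeans(is.na(toy))

print(depressed_levels)
print(depressed_counts)
print(n_missing)
```

Counting missing values per variable first makes it easier to decide what to do with them later.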
Depends on what you want to do:
If you get fewer hours of sleep per night, does that mean you have a higher BMI?
If you have a lot of depressed episodes, do you also get less sleep?
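Outside the app, these two questions could be approached roughly as follows in base R. This is a sketch under assumptions: the `toy` data frame is invented for illustration, and a correlation and group means are just one simple way to look at an association, not necessarily what the Data Explorer does.

```r
# Hypothetical toy data (values are invented, not taken from the dataset)
toy <- data.frame(
  SleepHrsNight = c(5, 6, 7, 8, 6, 7, NA, 9),
  BMI = c(31.2, 28.4, 24.9, 23.1, NA, 26.0, 27.5, 22.8),
  Depressed = factor(
    c("AlmostAll", "Majority", "None", "None", "Several", "Several", "None", NA),
    levels = c("None", "Several", "Majority", "AlmostAll")
  )
)

# Numeric vs. numeric: correlation between sleep and BMI,
# using only rows where both values are present
sleep_bmi_cor <- cor(toy$SleepHrsNight, toy$BMI, use = "complete.obs")

# Numeric vs. categorical: mean hours of sleep within each Depressed category
# (rows with a missing Depressed value are dropped by tapply)
sleep_by_depressed <- tapply(toy$SleepHrsNight, toy$Depressed, mean, na.rm = TRUE)

print(sleep_bmi_cor)
print(sleep_by_depressed)
```

A negative correlation here would be consistent with "less sleep, higher BMI", and comparing the group means shows whether sleep differs across Depressed categories.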
[…] Age in the dataset?
[…] BMI go up?
[…] Depressed people have a higher systolic blood pressure than non-depressed people?
[…] LittleInterest with Depressed?
Each group should present the findings from 1 interesting question:
[…] Depressed variable with it.
You are now a full-fledged data explorer!
burro: an R package that lets you explore your data:
http://laderast.github.io/burro
Are people interested in an optional session?
You’re convinced that the effect you’re interested in is real.